Abstract. CMB anisotropy data can place powerful constraints on theories of the evolution of our
Universe. Using the observations from the large number of CMB experiments, many studies
have constrained cosmological parameters within different frameworks. Assuming,
for example, the inflationary paradigm, one can compute confidence intervals on the different
components of the energy density, or on the age of the Universe, as inferred from the current set
of CMB observations. The aim of this note is to present some of the available methods for
deriving cosmological parameters and their confidence intervals from CMB data, together with
some practical issues that arise when investigating a large number of parameters. © 2003 Académie
des sciences
Résumé. Observations of the anisotropies of the cosmic microwave background (CMB) can place
strong constraints on theories of the evolution of our Universe. Such data have been used
to constrain various parameters within different theoretical frameworks: the age of the
Universe, its baryonic content, etc. The aim of this contribution is to present different
possible methods for extracting the cosmological parameters and their confidence
intervals from CMB data. Practical questions raised by the use of large numbers of
parameters are also addressed. © 2003 Académie des sciences
1. Introduction

The extraction of information from cosmic microwave background (CMB) anisotropies is a classic
problem of model testing and parameter estimation, the goals being to constrain the parameters of an
assumed model and to decide whether the best-fit model (parameter values) is indeed a good description
of the data. Maximum likelihood is often used as the method of parameter estimation. Within the class of
models to be examined, the probability distribution of the data is maximized as a function of the model
parameters, given the actual, observed data set.^1 Once found, the best model must then be judged on its
ability to account for the data, which requires the construction of a statistic quantifying the goodness
of fit (GOF). Finally, if the model is retained as a good fit, one defines confidence intervals on the
parameter estimates. The exact meaning of these confidence intervals depends heavily on the method used
to construct them, but the desire is always the same: one wishes to quantify the 'ability' of other
parameter values to explain the data as well as (or not as well as) the best-fit values. Given the quality
of the current data, and the aim of the analysis (precise determination of the cosmological parameters),
much attention should be paid to the robustness and accuracy (unbiased techniques) of the methods used.
I review the different ways of estimating the likelihood function of the parameters, focusing on the use
of the angular power spectrum (the $C_\ell$'s). Then some methods to compute the goodness of fit and the
confidence intervals are discussed. Finally, some practical issues for such computations are addressed.
In this review, I take the temperature fluctuations as the observed quantity. The same approaches could
be applied to the polarisation signal of the CMB.

2. Likelihood 

Data on the CMB consist of sky brightness measurements, usually given in terms of equivalent
temperature in pixels. The likelihood function is to be constructed using these pixel values.^2 Standard
inflationary scenarios predict Gaussian sky fluctuations, which implies that the pixels should be modeled
as random variables following a multivariate normal distribution, with a covariance matrix given as a
function of the model parameters (in addition to a noise term). It is important to note that, since the
parameters enter through the covariance matrix in a non-linear way, the likelihood function $\mathcal{L}$
is not a linear function of the (cosmological) parameters.

Although it would seem straightforward to estimate model parameters directly from the maps with the
likelihood function (full analysis), in practice the procedure is considerably complicated by the
complexity of the model calculations and by the size of the data sets ([1, 2, 3, 4]). Maps consisting of
several hundred thousand pixels (the present situation) are extremely cumbersome to manipulate, and the
million-pixel maps expected from Planck cannot be analyzed by this method in any practical way. An
alternative is to first estimate the angular power spectrum from the pixel data and then work with this
reduced set of numbers. For Gaussian fluctuations, there is in principle no loss of information. Because
of the large reduction of the data ensemble to be manipulated, the tactic has been referred to as
"radical compression" ([2]). The power spectrum has in fact become the standard way of reporting CMB
results; it is the best visual way to understand the data, and in any case it is what is actually
calculated in the models. The first part of this section briefly describes the full analysis procedure.
The second part focuses on the power spectrum as the starting point for cosmological parameter
estimation. In the latter case, because the $C_\ell$ estimates are not Gaussian distributed, more
elaborate approximations must be used.

^1 In recent Bayesian analyses, quoting the mean of the product of the likelihood and prior functions as
the best model is preferred.

^2 The term pixel will be understood to also include temperature differences.

2.1. Full analysis

Temperature fluctuations of the CMB are described by a random field in two dimensions:
$\Delta(\hat n) = (\delta T/T)(\hat n)$, where $T$ refers to the temperature of the background and
$\hat n$ is a unit vector on the sphere. It is usual to expand this field using spherical harmonics:

$$ \Delta(\hat n) = \sum_{\ell m} a_{\ell m} Y_{\ell m}(\hat n), \qquad \langle a_{\ell m}\, a^*_{\ell' m'} \rangle_{ens} = C_\ell\, \delta_{\ell\ell'}\, \delta_{mm'} \qquad (1) $$

The $a_{\ell m}$'s are randomly selected from the probability distribution characterizing the process
generating the perturbations. In the inflation framework, which we consider here, the $a_{\ell m}$'s are
Gaussian random variables with zero mean and covariance^3 given in Eq. (1). The $C_\ell$'s then represent
the angular power spectrum. We may express the observed (or beam-smeared) correlation between two points
separated on the sky by an angle $\theta$ as

$$ C_b(\theta) = \langle \Delta_b(\hat n_1)\, \Delta_b(\hat n_2) \rangle_{ens} = \frac{1}{4\pi} \sum_\ell (2\ell+1)\, C_\ell\, B_\ell^2\, P_\ell(\mu) \qquad (2) $$

where $P_\ell$ is the Legendre polynomial of order $\ell$, $\mu = \cos\theta = \hat n_1 \cdot \hat n_2$
and $B_\ell$ is the harmonic coefficient of the beam decomposition.^4 The statistical isotropy of the
perturbations demands that the correlation function depend only on the separation, $\theta$, which is in
fact what permits such an expansion.

Given these relations and a CMB map, it is now straightforward to construct the likelihood function,
whose role is to relate the $N_{pix}$ observed sky temperatures, which we arrange in a data vector with
elements $d_i = \Delta_b(\hat n_i)$, to the model parameters, represented by a parameter vector
$\Theta$. For Gaussian fluctuations (with Gaussian noise) this is simply a multivariate Gaussian:

$$ \mathcal{L}(\Theta) = \mathrm{Prob}(\vec d\,|\,\Theta) = \frac{1}{(2\pi)^{N_{pix}/2}\, |C|^{1/2}}\; e^{-\frac{1}{2}\, \vec d^{\,T} \cdot C^{-1} \cdot \vec d} \qquad (3) $$

The first equality reminds us that the likelihood function is the probability of obtaining the data
vector given the model as defined by its set of parameters. In this expression, $C$ is the pixel
covariance matrix:

$$ C_{ij} = \langle d_i d_j \rangle_{ens} = T_{ij} + N_{ij} \qquad (4) $$

where the expectation value is understood to be over the theoretical ensemble of all possible universes
realizable with the same parameter vector. The second equality separates the model's pixel covariance,
$T$, from the noise-induced covariance, $N$. According to Eq. (2),
$T_{ij} = C_b(\theta_{ij}) = \frac{1}{4\pi} \sum_\ell (2\ell+1)\, C_\ell\, W_{ij}(\ell)$,
where $W$, the window matrix, contains the beam and strategy effects (direct measure, differences). The
parameters may be either the individual $C_\ell$ (or band-powers, discussed below), or the fundamental
cosmological constants, $\Omega$, $H_0$, etc. In the latter situation, the parameter dependence enters
through detailed relations of the kind $C_\ell[\Theta]$, specified by the adopted model (e.g., inflation).

For cosmological parameter estimation, one has to compute the likelihood value of Eq. (3) for each member
of the family of models investigated. For each set of parameters $\Theta$, the computational time for the
likelihood scales like $N_{pix}^3$, unless geometrical symmetries in the observational strategy allow the
use of a faster algorithm for inverting $C$. Investigating a handful of parameters with reasonable steps
and ranges (typically $N$ parameters) with a map of a few thousand pixels becomes extremely cumbersome.
Only a few studies have been done in this way ([16, 17, 18, 19]). Such a computation with
second-generation experiments is thus prohibitive.^5
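To make the cost of the full analysis concrete, here is a minimal pure-Python sketch of Eqs. (2) to (4): it builds a pixel covariance from a spectrum and evaluates $\ln\mathcal{L}$ through a Cholesky factorization, the $N_{pix}^3$ step. The function names, the perfect beam ($B_\ell = 1$, $W_{ij}(\ell) = P_\ell(\mu_{ij})$) and the uniform white noise are illustrative assumptions, not the setup of any particular experiment.

```python
import math

def legendre(lmax, mu):
    """Return [P_0(mu), ..., P_lmax(mu)] from the standard three-term recurrence."""
    p = [1.0, mu]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * mu * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]

def pixel_covariance(directions, cl, noise_var):
    """C_ij = (1/4pi) sum_l (2l+1) C_l P_l(n_i . n_j) + noise_var * delta_ij.
    Assumption: perfect beam and uncorrelated, uniform noise."""
    lmax = len(cl) - 1
    n = len(directions)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            mu = sum(a * b for a, b in zip(directions[i], directions[j]))
            mu = max(-1.0, min(1.0, mu))  # guard against rounding outside [-1, 1]
            pl = legendre(lmax, mu)
            t = sum((2 * l + 1) * cl[l] * pl[l] for l in range(lmax + 1)) / (4 * math.pi)
            cov[i][j] = cov[j][i] = t + (noise_var if i == j else 0.0)
    return cov

def cholesky(a):
    """Lower-triangular L with a = L L^T; this is the O(N_pix^3) step."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def log_likelihood(d, cov):
    """ln L = -1/2 [ d^T C^-1 d + ln|C| + N ln(2 pi) ], i.e. the log of Eq. (3)."""
    low = cholesky(cov)
    n = len(d)
    y = []  # forward substitution: solve L y = d, so that d^T C^-1 d = y . y
    for i in range(n):
        y.append((d[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    chi2 = sum(v * v for v in y)
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (chi2 + logdet + n * math.log(2 * math.pi))
```

In a parameter search, `pixel_covariance` and `log_likelihood` would be called once per model, which is why the method becomes impractical as the number of pixels grows.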

^3 The indicated averages are to be taken over the theoretical ensemble of all possible anisotropy
fields, of which our observed CMB sky is but one realization.

^4 Note that this expansion presupposes axial symmetry for the beam.

^5 Except for some particular symmetries, [5].

2.2. Using the Angular Power Spectrum

In order to avoid the computational cost of the full analysis, an alternative consists in first
estimating the angular power spectrum from the pixel data and then working with the latter to estimate
the cosmological parameters. The critical issue is then how to use the power spectrum correctly for
unbiased parameter estimation and model evaluation. The angular power spectrum can be evaluated with
different techniques (see Hamilton, this issue, [6]). Again, a likelihood analysis from the maps can be
done by inserting a spectral form into the definition of $T$. For example, the commonly used flat
band-power, $\delta T_{fb}$ (or $C_b \equiv \delta T_{fb}^2$ over a certain range in $\ell$), actually
represents the equivalent logarithmic power integrated over the band, which simplifies the correlation
matrix as follows:

$$ C_\ell = \frac{2\pi\, \delta T_{fb}^2}{\ell(\ell+1)}, \qquad T_{ij} = \frac{1}{2}\, \delta T_{fb}^2 \sum_\ell \frac{2\ell+1}{\ell(\ell+1)}\, W_{ij}(\ell) \qquad (5) $$

In this way, we may write Eq. (3) in terms of the band-power and treat the latter as a parameter to be
estimated. This then becomes the band-power likelihood function, $\mathcal{L}(\delta T_{fb})$. Figure 1
shows the latest band-power estimates of the CMB fluctuations. Some of the points have been obtained by
maximizing this likelihood function; the errors are typically found in a Bayesian approach, by
integrating $\mathcal{L}$ over $C_b$ with a uniform prior (e.g. DASI [12], VSA [13], CBI [14],
ACBAR [15]). Other band powers and errors are estimated using Monte Carlo based methods, such as those of
WMAP [7], BOOMERANG [10], MAXIMA [11] and Archeops [9]. Notice that the variance due to the finite sample
size (i.e., the sample variance, including the cosmic variance due to our observation of one realization
of the sky) is fully incorporated into the analyses.

Figure 1: Angular power spectrum estimates of the CMB anisotropies in September 2003
([7, 8, 9, 10, 11, 12, 13, 14, 15]). Notice the good agreement between band powers coming from different
experiments (different detectors, technology, scanning strategy, ...) up to $\ell \sim 500$.

Given a set of band-powers, how should one proceed to constrain the fundamental cosmological
parameters, denoted in this subsection by $\Theta$? If we had an expression for
$\mathcal{L}(\delta T_{fb})$ for our set of band-powers $\vec{\delta T}_{fb}$, then we could write
$\mathcal{L}(\vec{\delta T}_{fb}) = \mathrm{Prob}(\vec d\,|\,\vec{\delta T}_{fb}) = \mathrm{Prob}(\vec d\,|\,\vec{\delta T}_{fb}[\Theta]) = \mathcal{L}(\Theta)$.
Thus, our problem is reduced to finding an expression for $\mathcal{L}(\delta T_{fb})$; but as we have
seen, this is a complicated function of $\delta T_{fb}$, requiring use of all the measured pixel values
and the full covariance matrix with noise, the very thing we are trying to avoid. Our task then is to
find an approximation for $\mathcal{L}(\delta T_{fb})$.
2.2.1. $\chi^2$ minimization

The most obvious way of finding "the best model" given a set of points and errors is the traditional
$\chi^2$ minimization. This means that we assume a Gaussian shape for the likelihood function, of the kind:

$$ \mathcal{L}(\vec{\delta T}_{fb}) \propto e^{-\chi^2/2}, \qquad \chi^2 = (\vec{\delta T}_{fb}^{obs} - \vec{\delta T}_{fb})^T \cdot M^{-1} \cdot (\vec{\delta T}_{fb}^{obs} - \vec{\delta T}_{fb}) \qquad (6) $$

where $M$ is the correlation matrix between the different flat band estimates. The main problem with this
approach is that it treats the flat band-power estimates as Gaussian distributed data, which they are not
(they obey the statistics of the square of a Gaussian). It has been shown that such a procedure gives
a biased estimate of the cosmological parameters and poor estimates of the confidence intervals
([2, 20]), which has motivated the search for more accurate approximations of the likelihood function.
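As an illustration of the Gaussian assumption just described, the following sketch evaluates $\chi^2 = r^T M^{-1} r$ for a set of band powers, where $r$ is the vector of residuals between observed and model band powers. The function names and the small linear solver are illustrative; in practice a linear algebra library would be used.

```python
def solve(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(m)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][c] * x[c] for c in range(i + 1, n))) / a[i][i]
    return x

def chi2(obs, model, m):
    """chi^2 = r^T M^-1 r with r = obs - model and M the band-power correlation matrix."""
    r = [o - t for o, t in zip(obs, model)]
    return sum(ri * xi for ri, xi in zip(r, solve(m, r)))

def gaussian_log_like(obs, model, m):
    """ln L up to an additive constant, under the Gaussian assumption."""
    return -0.5 * chi2(obs, model, m)
```

The bias discussed in the text does not come from this arithmetic but from the assumption itself: the band-power estimates are not Gaussian distributed.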



2.2.2. More elaborate approximations

Different studies have been made to construct better analytical approximations directly from the form of
the flat-band likelihood function ([2, 20, 21]). This section focuses on two of them. One is derived from
the likelihood function in a particular case, for which it is actually exact [20]. The other, the most
widely used in recent years, offers the advantage of being very close to a $\chi^2$ minimization after an
appropriate change of variables [2]. Both approximations need only a small amount of information and are
meant to be used directly with the spectra given in the literature.^6

• BDBL approximation

The Bartlett, Douspis, Blanchard and Le Dour approximation is based on the analytical form of the
likelihood in an "ideal" experiment, where all the pixels ($N_{pix}$) are independent (uncorrelated)
random variables and the noise is uncorrelated and uniform ($\sigma_N$). In that particular case, one can
write the exact likelihood function as follows:

$$ \mathcal{L}(\delta T_{fb}) \propto X^{\nu/2}\, e^{-X/2}, \qquad X[\delta T_{fb}] \equiv \frac{(\delta T_{fb}^{(o)})^2 + \beta^2}{\delta T_{fb}^2 + \beta^2}\; \nu \qquad (7) $$

where $\delta T_{fb}^{(o)}$ is the observed (maximum-likelihood) band power, $\beta = \sigma_N$ and
$\nu = N_{pix}$. The approximation consists in keeping the same likelihood form for real experiments,
where the noise is no longer uniform and uncorrelated, and the pixels are not independent. To take these
differences into account, one treats $\nu$ and $\beta$ as free parameters and fixes them by fitting the
68% and 95% confidence intervals (published or inferred from the true likelihood function). Figure 2
shows the comparison between this approximation and the true likelihood functions obtained for TOCO data.

The advantage of this approximation is that the better behaved the experiment is (fewer correlations,
more uniform noise, ...), the better the approximation becomes, being exact in the ideal case. It is also
unbiased at the maximum of the likelihood and allows one to recover the full shape of the likelihood
function. The drawback of this approximation is that it is defined only for uncorrelated flat band
powers. The possible correlations between bands are not taken into account, as the full likelihood is
given by the product of all the individual likelihood functions:
$\mathcal{L}(\vec{\delta T}_{fb}) = \prod_i \mathcal{L}_i(\delta T_{fb}^{(i)})$.
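The ideal-experiment likelihood can be sketched in a few lines. The form used below, $\ln\mathcal{L} = (\nu/2)\ln X - X/2$ with $X = \nu\,(\delta T_o^2 + \beta^2)/(\delta T^2 + \beta^2)$, is an assumption of this example: it is what one obtains for $\nu$ independent Gaussian pixels with signal variance $\delta T^2$ and uniform noise variance $\beta^2$, where $\delta T_o$ is the maximum-likelihood band power. The function names and the numerical values in the usage note are hypothetical.

```python
import math

def bdbl_log_like(dt, dt_obs, nu, beta):
    """ln L (up to a constant) for one band, ideal-experiment form.
    dt: model band power, dt_obs: observed band power,
    nu, beta: effective number of pixels and noise level (fitted in practice)."""
    x = nu * (dt_obs ** 2 + beta ** 2) / (dt ** 2 + beta ** 2)
    return 0.5 * nu * math.log(x) - 0.5 * x

def bdbl_total(model, obs, nus, betas):
    """Bands are treated as uncorrelated: the log-likelihoods simply add."""
    return sum(bdbl_log_like(m, o, n, b) for m, o, n, b in zip(model, obs, nus, betas))
```

For example, with `dt_obs = 50`, `nu = 30` and `beta = 20`, the function peaks at `dt = 50` and falls off more slowly towards higher band powers than towards lower ones, which is the asymmetry a Gaussian in $\delta T_{fb}$ cannot reproduce.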

Figure 2: Comparison to the TOCO97 likelihood function for all the approximations described in this
section. The black solid line shows the full likelihood function computed from the map. The red
dot-dashed and blue dashed lines show the BDBL and BJK approximations respectively. The two-wing Gaussian
and WMAP-type approximations are plotted as dotted green and dot-dot-dashed yellow lines.

^6 As we will see, these approximations need one more piece of information than the basic $\chi^2$
minimization in order to take into account the non-Gaussian behavior of the likelihood. The authors have
asked that this information be provided in addition to the band-power estimates and errors. Recent
experiments have published the necessary information in their papers.

• BJK approximation

In the second case, also referred to as the Bond, Jaffe & Knox approximation, the motivation is driven by
the need to work with Gaussian distributed variables, for which the $\chi^2$ is no longer biased. Writing
the likelihood in spherical harmonic space for the same ideal experiment as above, and considering
$\mathcal{C}_\ell \equiv \ell(\ell+1) C_\ell/(2\pi) = \delta T_\ell^2$ and
$\mathcal{N}_\ell \equiv \ell(\ell+1) N_\ell/(2\pi)$, where $N_\ell = \langle |n_{\ell m}|^2 \rangle$ is
the noise power spectrum in spherical harmonics, one can show that the curvature matrix evaluated at the
maximum is proportional to $(\mathcal{C}_\ell + \mathcal{N}_\ell/B_\ell^2)^{-2}\, \delta_{\ell\ell'}$. If
one defines $Z_\ell = \ln(\mathcal{C}_\ell + x_\ell)$, where in this particular "ideal experiment"
$x_\ell = \mathcal{N}_\ell/B_\ell^2$, the curvature matrix expressed in terms of $Z$ is then constant. The
BJK approximation to the likelihood is then to take $Z_b$ (determined in a band) as normally distributed
in realistic experiments (by finding the appropriate expression for the corresponding $x_b$). From the
previous statements, one can then express the likelihood as:

$$ \mathcal{L}(\vec{\delta T}_{fb}) = e^{-(\Delta\vec Z \cdot \mathcal{M}'^{-1} \cdot \Delta\vec Z^{\,t})/2}, \qquad Z_i = \ln\left(\delta T_{fb}^2(i) + x_b(i)\right) \qquad (8) $$

where $Z_i$ is evaluated in band $i$, $\Delta\vec Z$ is the difference between the model and observed
values of $\vec Z$, and $\mathcal{M}'$ is the band-power correlation matrix expressed in the $Z$ variables.

The absolute value of the exponent in Eq. (8) also gives an estimate of the goodness of fit. As we will
see below, this approximation is slightly biased at the maximum of the likelihood, but it has been shown
to be a reliable approximation (see Fig. 2) and is available online through the RADPACK package of [22].
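A minimal sketch of the offset log-normal form follows, restricted to uncorrelated bands for brevity. It assumes that the quoted errors `sigma` refer to the band powers in $\delta T^2$ units and are propagated to the $Z$ variables as $\sigma_Z = \sigma/(\mathcal{C}^{obs} + x)$; the function name and arguments are illustrative and do not reproduce the RADPACK interface.

```python
import math

def bjk_log_like(model, obs, sigma, x):
    """ln L (up to a constant) in the Z = ln(C + x) variables.
    model, obs: band powers (delta T^2 units); sigma: 1-sigma errors on obs;
    x: noise-related offsets x_b. Assumption: bands are uncorrelated."""
    chi2_z = 0.0
    for m, o, s, xb in zip(model, obs, sigma, x):
        z_model = math.log(m + xb)
        z_obs = math.log(o + xb)
        sigma_z = s / (o + xb)  # error propagated to the Z variable
        chi2_z += ((z_model - z_obs) / sigma_z) ** 2
    return -0.5 * chi2_z
```

Two limits are worth noting: for $x_b$ much larger than the band power (noise dominated) the expression tends to the ordinary Gaussian $\chi^2$, while for $x_b \to 0$ it becomes a pure log-normal.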

The WMAP team [25] adopted a hybrid approximation,
$\ln\mathcal{L} = \frac{1}{3} \ln\mathcal{L}_{Gauss} + \frac{2}{3} \ln\mathcal{L}_{BJK}$, motivated
by an expansion of the true likelihood around the maximum.^7 This formulation has the advantage of being
unbiased around the maximum, but it has not been tested against the real likelihood function in the wings.

Once the likelihood function $\mathcal{L}(\delta T_{fb})$ is known, one is able to compute
$\mathcal{L}(\Theta)$ for a family of models. As we have seen, temperatures on the sky are Gaussian random
variables, so the "radical compression" is valid and induces no loss of information. This is true only if
the whole spectrum (within the limit of sensitivity of the experiment, i.e. the window function) is
specified. However, for various reasons (partial sky coverage, noise correlations, ...) only the spectrum
in bands is recovered: the spectrum is approximated by steps in $\ell$. Such a description induces a loss
of information which may have some effect on the cosmological parameter estimation (bias and
degeneracies). Douspis et al. ([19]) have shown that a better description of the spectrum (power in band
and slope in band) can decrease the bias. The second and third generations of experiments provide (or
will provide) better sensitivity and less correlated measurements, which allow one to recover the
spectrum with better resolution in $\ell$ (see WMAP for example), thereby decreasing the bias. Most
studies are nevertheless performed using the set of flat-band likelihood functions.



^7 where $\mathcal{L}_{Gauss} = \exp(-\chi^2/2)$

3. Goodness of fit

Once the (approximate) likelihood values of the models investigated have been computed, one should find
the best model (maximum) and evaluate the quality of the fit before constructing the parameter
constraints. As a general rule, one must judge the quality of the fit before any serious consideration of
the confidence intervals on parameters. This requires the application of a goodness-of-fit (GOF)
statistic. The latter is usually a function of both the model and the data, which reaches a maximum (or
minimum) when the data are generated from the theory. The 'significance' may then be defined as the
probability of obtaining $gof > gof_{obs}$. On this basis, it permits a quantitative evaluation of the
quality of the best model's fit to the data: if the probability of obtaining a value of the GOF statistic
at least as extreme as the one observed (from the actual data set) is low (low significance), then the
model should be rejected. Without such a statistic, one does not know if the best model is a good model,
or simply the "least bad" of the family.

In the full likelihood analysis method, the best model (set of parameters), denoted $\Theta_{best}$ in
the following, is obtained by maximizing the likelihood function of Eq. (3). One can easily note the
Gaussian form of Eq. (3) in the data vector $\vec d$. Given the best model, the most obvious GOF statistic
is then clearly $gof = \vec d^{\,T} \cdot C^{-1} \cdot \vec d$, where $C = C(\Theta_{best})$ is the
correlation matrix evaluated at the best model. For the Gaussian fluctuations we have assumed, this
quantity follows a $\chi^2$ distribution, with a number of degrees of freedom (DOF) approximately equal
to the number of pixels minus the number of parameters.^8

The use of the $\chi^2$ method (in $\delta T_{fb}$, or with any change of variables as in the BJK
approximation) makes the computation of the GOF even easier. The obvious GOF statistic is just one
number, the value of the $\chi^2$ evaluated at the minimum: $gof = \chi^2(\Theta_{best})$. It is of
course true that if the number of contributing effective DOF is large, a power estimate will closely
follow a Gaussian; this, however, is never the case on the largest scales probed by a survey. Douspis et
al. [28] have shown, for example, that the $\chi^2$ approach leads to quantitatively different results
than other, more appropriate GOF statistics.
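When the $\chi^2$ statistic is used, the significance is the upper tail probability of a $\chi^2$ distribution. The sketch below computes it with the series expansion of the regularized incomplete gamma function; the function names are illustrative, and the caveat of the text applies: the result is only indicative when the band powers are far from Gaussian.

```python
import math

def chi2_sf(x, dof):
    """P(chi^2_dof > x), using the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 1.0
    a, h = 0.5 * dof, 0.5 * x
    term = math.exp(-h + a * math.log(h) - math.lgamma(a + 1.0))
    if term == 0.0:  # underflow: far in the lower or the upper tail
        return 1.0 if h < a + 1.0 else 0.0
    total, n = term, 1
    while term > 1e-16 * total:
        term *= h / (a + n)
        total += term
        n += 1
    return max(0.0, 1.0 - total)

def gof_significance(chi2_min, n_data, n_params):
    """Probability of a chi^2 at least as large as chi2_min, with the usual
    (approximate) degree-of-freedom counting n_data - n_params."""
    return chi2_sf(chi2_min, n_data - n_params)
```

A very small significance argues for rejecting the best model; a value very close to one usually signals overestimated errors rather than an excellent model.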

When more elaborate approximations are used, the goodness-of-fit computation is less obvious. One
should first reconstruct the distribution of the estimators. This can be a natural output when Monte
Carlo based methods are used to extract the $C_\ell$'s [26, 27], but it is mostly unknown when one
applies traditional methods. Douspis et al. [28] have proposed an approximation which allows one to
reconstruct the distribution from the shape of the flat-band likelihood function.^9 When the latter is
known, one should build a GOF statistic in order to compute the probability of the data given the best
model (see [28] for examples).

Knowing that the best model is indeed a good fit to the data, or that the data have a good chance to be 
generated from this model, one should proceed by estimating the confidence intervals on the investigated 
parameters. 

^8 This recipe does not strictly apply in the present case, because the parameters are non-linear
functions of the data; it is nevertheless standard practice. In any case, the number of pixels is in
practice much larger than the number of parameters. The number of degrees of freedom is also less than
the number of pixels because of the correlations between pixels (non-diagonal correlation matrix).
Nevertheless, the matrices are mainly diagonal and the gof is then mostly insensitive to the small
reduction in the number of DOF.

^9 This technique can be used both ways, allowing one to reconstruct the likelihood when the distribution
is known.

4. Confidence Intervals

The estimation of confidence intervals is mostly a question of definition. Most CMB analyses have
been done in the Bayesian framework and are thus dependent on the priors assumed. Some frequentist
attempts have been made in order to eliminate such dependencies. The reader can find more about the
comparison between the two methods in [25].

Typically, the frequentist analyses are related to the goodness-of-fit statistics and the probability
distribution of the data for a given model (set of parameters). In the Bayesian approach, one reconstructs
the conditional (posterior) probability density function (pdf), $P(\Theta_{true}|d_{obs})$, for the
unknown $\Theta_{true}$ given the observation $d_{obs}$,^10 from the pdf for observing $d$ (whose
dependence on $\Theta$ is known) using Bayes' theorem. The latter pdf evaluated at $d_{obs}$ is known as
the likelihood function: $P(d_{obs}|\Theta) = \mathcal{L}(d_{obs}|\Theta)$.

$$ P(\Theta_{true}|d_{obs}) = \mathcal{L}(d_{obs}|\Theta_{true})\, P(\Theta_{true}) / P(d_{obs}) \qquad (9) $$

The denominator is just a normalization factor, and thus one of the issues is what to use for the prior
$P(\Theta_{true})$. If one knows the likelihood and fixes the prior (usually taken as uniform in terms of
the parameters) then one knows the posterior probability distribution. A Bayesian credible region
(interval) for a parameter is the range of parameter values that encloses a fixed amount of such
probability. As the questions asked in the two approaches are quite different, one does not necessarily
expect the intervals computed with the two methods to be similar.

^10 In our problem, $d_{obs}$ should be taken as a set of flat band powers.

In the following I describe two ways of estimating confidence intervals, referred to as marginalisation
and maximization. These two approaches are usually presented in opposition. In the limit of a
Gaussian-shaped likelihood with linear dependence on the parameters, the two techniques are equivalent
(see the demonstration in [33]). Unfortunately this is not the case in cosmological parameter estimation.
Both techniques consist of two steps: first one reduces the number of parameters in order to visualize
the likelihood (or pdf) function (or surface in two dimensions); then one computes the confidence
intervals for each parameter.

4.1. Marginalisation 

In CMB analyses, as $\Theta$ is usually a vector of 5 to 10 parameters, it is quite hard to visualize
the posterior (or likelihood) distribution. It is then common to retrieve a one-dimensional probability
by using an integration method (marginalisation). This technique is mostly used in Bayesian approaches to
parameter estimation. Let us assume that $\Theta = (x, y, \ldots, z)$ and that we are interested in
plotting the likelihood and finding the 68% confidence interval on $x$, the other parameters having been
marginalised over. One usually computes:

$$ \mathcal{L}(x) = \int \ldots \int \mathcal{L}(d_{obs}|(x, y, \ldots, z))\, P(x, y, \ldots, z)\, dy \ldots dz \qquad (10) $$

$$ \int_0^{x_m} \mathcal{L}(d_{obs}|x)\, dx = 0.5, \qquad \int_0^{x_-} \mathcal{L}(d_{obs}|x)\, dx = 0.16, \qquad \int_0^{x_+} \mathcal{L}(d_{obs}|x)\, dx = 0.84 \qquad (11) $$

where we assume that $x$ is a positive variable, $\mathcal{L}$ is normalized to unity and $P$ is a
uniform prior on the parameters.^11 $[x_-, x_+]$ is then referred to as the 68% confidence interval on
$x$ with all the other parameters marginalised over, and $x_m$ (the median of the distribution) is quoted
as the central value; such a computation of intervals is referred to as EQT, for "equi-probability
tail".^12 This may seem easy in one dimension, but it can become cumbersome when dealing with a
10-dimensional likelihood function (especially for the multi-dimensional integral of the marginalisation,
Eq. (10)). In order to be less dependent on these effects, and to decrease the computational time of this
step, the maximization technique is mostly used.
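On a regular grid, Eqs. (10) and (11) reduce to sums and to a cumulative integral. The following sketch, for two parameters and a uniform prior, is illustrative only: the function names, the grid and the toy Gaussian likelihood are assumptions of this example.

```python
import math

def marginalise(like, dy):
    """like[i][j] = L(x_i, y_j) on a regular grid; integrate over y (uniform prior)."""
    return [sum(row) * dy for row in like]

def equal_tail(xs, lx, probs=(0.16, 0.5, 0.84)):
    """Points below which a fraction p of the integrated 1D likelihood lies,
    from the cumulative trapezoid rule and linear interpolation."""
    cum = [0.0]
    for i in range(1, len(xs)):
        cum.append(cum[-1] + 0.5 * (lx[i] + lx[i - 1]) * (xs[i] - xs[i - 1]))
    out = []
    for p in probs:
        target = p * cum[-1]
        i = next(k for k in range(1, len(cum)) if cum[k] >= target)
        f = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
        out.append(xs[i - 1] + f * (xs[i] - xs[i - 1]))
    return out

# Toy example: Gaussian in x (mean 5, sigma 1) times Gaussian in y (mean 1, sigma 0.2).
xs = [0.01 * i for i in range(1001)]
ys = [0.05 * j for j in range(41)]
like = [[math.exp(-0.5 * (x - 5.0) ** 2 - 0.5 * ((y - 1.0) / 0.2) ** 2) for y in ys]
        for x in xs]
x_minus, x_m, x_plus = equal_tail(xs, marginalise(like, 0.05))
```

With more parameters the inner sum becomes a multi-dimensional one, and its cost grows exponentially with the number of marginalised dimensions.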

^11 The prior is usually taken as uniform in $\Theta$ in order to express our ignorance of the true value
of $\Theta$, even if there is no basis for this in Bayesian theory. In that sense, the interval will
depend on the choice of parameters. Taking $\gamma = x^2$ as the parameter, and thus a uniform prior in
$\gamma$, will result in a different interval (see Fig. 3).

^12 Due to the non-linear dependence of the likelihood on the parameters, its shape can be highly
non-Gaussian. In such cases, it can happen that the maximum of the likelihood (described earlier as the
best model) does not fall inside the 68% confidence interval (see for example Fig. 3). In that case one
should recompute the interval following the HPD (for "highest posterior density") technique, by fixing
$\mathcal{L}(x_-) = \mathcal{L}(x_+)$ and $\int_{x_-}^{x_+} \mathcal{L}(d_{obs}|x)\, dx = 0.68$.

4.2. Maximization

For the maximization technique, one also assumes a uniform prior in terms of the parameters (typically
$P(x, y, \ldots, z) = 1$) but defines the pdf (which is then equal to the likelihood) in one dimension as:

$$ \mathcal{L}(x) = \max_{y, \ldots, z} \left[ \mathcal{L}(d_{obs}|(x, y, \ldots, z))\, P(x, y, \ldots, z) \right] \qquad (12) $$

which means that for each value of $x$ one takes the maximum of the likelihood over all the other
dimensions. Then, instead of integrating the resulting one-dimensional likelihood as in Eq. (11) to
obtain the confidence intervals, one considers the values of the likelihood itself. For example, the
boundaries of the 68% CL region are those where the likelihood has fallen by a factor $e^{-1/2}$ from its
maximum, $\mathcal{L}_{max}$. As demonstrated in [33], this approximation becomes exact for multivariate
Gaussian forms. One can define different CL intervals by choosing $\Delta_\alpha\chi^2$ such that
$\mathcal{L}(x_\alpha)/\mathcal{L}_{max} = e^{-\Delta_\alpha\chi^2/2}$, where $\alpha$ marks the
confidence level. In one dimension, $\Delta_\alpha\chi^2 = 1, 4, 9$ for $\alpha = 68, 95, 99.7\%$ CL
respectively. Fig. 3 shows an example of such a case. This technique does not give the real 68%, 95%,
etc. confidence intervals, which by definition can be obtained only with Monte Carlo simulations, but it
is independent of the choice of parameter ($x$ versus $x^2$), the maximum of the likelihood is always
inside every interval by definition, and it is computationally inexpensive. Arguments and discussion
about the different techniques can be found in [30].

Figure 3: Comparison between marginalisation and maximization estimation of confidence intervals in an
extreme case. The black solid curve is a one-dimensional likelihood function. The blue vertical dashed
lines mark the central value and the boundaries of the 68% CL interval computed by integration
($[x_-, x_+]$). In that case the maximum of the likelihood, x = 7580, is just outside the interval. The
green dotted lines are obtained with the same method but taking $\gamma = x^2$ as the variable (see
text). Finally, the red solid vertical lines show the interval computed by taking values of the
likelihood higher than $\exp(-\Delta\chi^2/2) \times \mathcal{L}_{max}$, where here
$\mathcal{L}_{max} = 1$ and $\Delta\chi^2 = 1$.
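The maximization recipe of Eq. (12) is equally short on a grid. In this sketch the likelihood is first maximised over the second parameter, then the interval is read off where the profile has dropped by $e^{-\Delta\chi^2/2}$ from its maximum. The function names and the toy likelihood are illustrative, and a single peak well inside the grid is assumed.

```python
import math

def profile(like):
    """L(x_i) = max_j L(x_i, y_j): maximise over the other parameter (uniform prior)."""
    return [max(row) for row in like]

def delta_chi2_interval(xs, lx, delta=1.0):
    """Interval where L(x) >= L_max * exp(-delta / 2); the crossing points are
    linearly interpolated. Assumes one peak on a regular, increasing grid."""
    thr = max(lx) * math.exp(-0.5 * delta)
    inside = [i for i, v in enumerate(lx) if v >= thr]
    lo, hi = inside[0], inside[-1]

    def cross(i_out, i_in):
        f = (thr - lx[i_out]) / (lx[i_in] - lx[i_out])
        return xs[i_out] + f * (xs[i_in] - xs[i_out])

    x_lo = cross(lo - 1, lo) if lo > 0 else xs[0]
    x_hi = cross(hi + 1, hi) if hi < len(xs) - 1 else xs[-1]
    return x_lo, x_hi

# Toy example: Gaussian in x (mean 5, sigma 1) times Gaussian in y (mean 1, sigma 0.2).
xs = [0.01 * i for i in range(1001)]
ys = [0.05 * j for j in range(41)]
like = [[math.exp(-0.5 * (x - 5.0) ** 2 - 0.5 * ((y - 1.0) / 0.2) ** 2) for y in ys]
        for x in xs]
lx = profile(like)
one_sigma = delta_chi2_interval(xs, lx, delta=1.0)
two_sigma = delta_chi2_interval(xs, lx, delta=4.0)
```

For this Gaussian toy case the result coincides with the integrated interval; the two recipes differ as soon as the likelihood is skewed or the parameters are correlated in a non-Gaussian way.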

5. Practical Issues 

We have seen in the previous sections some of the existing statistical tools needed to perform a proper
cosmological parameter estimation. As one would like to investigate a large number of parameters, and
thus a large number of models, some practical issues must be taken into consideration. In the following
I describe two methods (and some techniques) which correspond to the two current ways of determining
cosmological parameters from CMB anisotropies.

5.1. Computation of the $C_\ell$'s

The release of CMBFAST ([37]) brought a major improvement in cosmological parameter estimation. The
ability to compute a theoretical power spectrum in less than one minute (instead of one hour previously)
has allowed different groups to investigate many parameters in the same analysis. Later versions of the
code improved on the first release by taking into account many physical effects (neutrinos, reionisation,
isocurvature modes, ...) and sped up the computation by separating small-scale effects from large-scale
ones ("k-splitting"). Derived from this initial code, CAMB ([38]) increased the speed of computation by
using FORTRAN 90 features. Finally, DASH ([39]) computes $C_\ell$ spectra in a few seconds by
interpolating a precomputed grid of spectra in Fourier space. All these codes are becoming more efficient
and faster, and are being adapted for parallel computing.

5.2. Gridding 

The gridding method consists in computing the likelihood values of different models following a regular
increment for each selected parameter, resulting in an $N_{param}$-dimensional matrix. Historically,
parameter estimation from CMB anisotropies started with small grids of models, typically 3 or 4 free
parameters with around 10 values each, the other ones being fixed to their supposed best values at the
time ([31, 32]). Then the number of parameters increased with the increasing speed of computer processors
and the development of faster codes to compute the $C_\ell$'s (e.g. CMBFAST).

One of the advantages of gridding is that one can compute a grid of models, store it and then compute the 
likelihood with one's set of data. If new data come out, one has just to compute the likelihood part again. 

As the number of models investigated increases, storage can become a problem [33]. Compression
techniques, in combination with approximate interpolations, can then be applied in order to store only
the necessary information. The computation time of the $C_\ell$'s may also become a problem. There again,
approximations based on the known behavior of the $C_\ell$'s with the parameters have been developed [33].

One of the drawbacks of the gridding method is that the position of the maximum of the likelihood
grid is highly dependent on the grid itself: the maximum necessarily falls on one point of the
grid. This effect also recurs when one uses the maximization technique. In order to avoid it, spline
interpolation techniques are used when looking for the maximum along one or more parameters [33].

Finally, by construction, the gridding method is well suited to multi-processor machines and to data-grid
computing.
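The two points above, the size of the grid and the dependence of the maximum on the grid spacing, can be illustrated with a short sketch. Note that it uses a three-point parabolic fit as a simple stand-in for the spline interpolation mentioned in the text; the function names are hypothetical.

```python
import itertools

def grid_models(axes):
    """All parameter combinations of a regular grid; their number is the product
    of the axis lengths, hence exponential in the number of parameters."""
    return list(itertools.product(*axes))

def refine_maximum(xs, lnl):
    """Position of the maximum of ln L along one parameter, from a parabola
    through the best grid point and its two neighbours (regular grid assumed)."""
    i = max(range(len(lnl)), key=lnl.__getitem__)
    if i == 0 or i == len(xs) - 1:
        return xs[i]  # maximum on the edge: the grid should be extended
    ym, y0, yp = lnl[i - 1], lnl[i], lnl[i + 1]
    denom = ym - 2.0 * y0 + yp
    if denom == 0.0:
        return xs[i]
    h = xs[i + 1] - xs[i]
    return xs[i] + 0.5 * h * (ym - yp) / denom
```

Three parameters with ten values each already give `len(grid_models([range(10)] * 3)) == 1000` models, and each additional parameter multiplies that number by ten.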

5.3. Markov Chain Monte Carlo

During the last few years, as an alternative to the gridding method, Markov Chain Monte Carlo
(MCMC) likelihood analyses have become a powerful tool in cosmological parameter estimation. This
method generates random draws from the posterior distribution, which are supposed to form a "realistic"
sample of the likelihood hypersurface. The mean, variance and confidence levels can then be derived from
this sample. Unlike the gridding method, whose cost scales exponentially with the number of parameters,
the MCMC method scales linearly with $N_{param}$, allowing one to explore a larger set of parameters or
to do the analysis faster.

Two issues should be highlighted in this method. The first one is the step size in the random sampling.
Typically, the step is taken to be the standard deviation of each parameter. If it is too large, the
acceptance rate is very low and the chain can take an extremely long time to converge. If it is too
small, the chain will be highly correlated, leading also to slow convergence. The second issue is the
convergence of the chain. At the beginning, the samples are strongly correlated with the starting point
and are not a "fair" representation of the posterior distribution. After a "burn-in" period, the chain
converges, its samples fairly represent the posterior, and the likelihood function can be retrieved. The
convergence criterion is not a well-defined quantity.

More explanations and applications can be found in [34, 35], and a FORTRAN 90 set of routines is
available online [36].
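The sampling procedure can be sketched with a basic random-walk Metropolis algorithm. This is a generic textbook sampler written for illustration, not the routines of [36]; the function names, the default burn-in length and the toy posterior are assumptions of this example.

```python
import math
import random

def metropolis(log_post, start, step, n_samples, burn_in=1000, seed=0):
    """Random-walk Metropolis sampler.
    log_post: function returning ln(posterior) for a list of parameters;
    step: Gaussian proposal width for each parameter.
    Returns the chain after burn-in and the overall acceptance rate."""
    rng = random.Random(seed)
    x = list(start)
    lp = log_post(x)
    chain, accepted = [], 0
    total = n_samples + burn_in
    for it in range(total):
        prop = [xi + rng.gauss(0.0, si) for xi, si in zip(x, step)]
        lp_prop = log_post(prop)
        # accept with probability min(1, P(prop) / P(x))
        if lp_prop >= lp or rng.random() < math.exp(lp_prop - lp):
            x, lp = prop, lp_prop
            accepted += 1
        if it >= burn_in:
            chain.append(list(x))
    return chain, accepted / total

# Toy posterior: one Gaussian parameter with mean 2 and standard deviation 0.5.
def toy_log_post(p):
    return -0.5 * ((p[0] - 2.0) / 0.5) ** 2

chain, rate = metropolis(toy_log_post, start=[0.0], step=[0.5], n_samples=20000)
mean = sum(s[0] for s in chain) / len(chain)
std = math.sqrt(sum((s[0] - mean) ** 2 for s in chain) / len(chain))
```

The returned acceptance rate is a practical handle on the step-size issue discussed above: a rate close to zero points to steps that are too large, a rate close to one to steps that are too small.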

6. Conclusions 

In order to derive the cosmological parameters in a given framework from the temperature fluctuations
of the CMB, many steps are needed. Once the observed power spectrum has been derived, one can use
different techniques to estimate successively the (approximate) likelihood value of the family of models
(parameters) investigated, the best model and its goodness of fit, and finally the confidence intervals
on each parameter. Each of these steps may be highly CPU and memory consuming. With ever better
observations, sensitivity and sky coverage, brute-force maximum likelihood methods become impossible.
Many approximations and techniques have therefore been developed in recent years, allowing more and more
data to be analyzed with increasing speed. When the appropriate method is used, this leads to an unbiased
estimate of the cosmological parameters. These developments have demonstrated that efficient methods
can be developed to take full advantage of data at the Planck accuracy and to determine parameters
of cosmological relevance to a remarkably high accuracy. This is opening the golden road of precision
cosmology.

Acknowledgements. MD would like to thank A. Blanchard, J. Bartlett and K. Moodley for useful discussions and 